-
Notifications
You must be signed in to change notification settings - Fork 54
Introduce file staging delegation via JSON staging manifest #399
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
This should be the general strategy for collecting input and output files for ARC, DIRAC, AWS batch etc.
@mvdbeek That's all I needed to build the ARC integration (coming soon as a different PR). You may want to compare it to your branch offline_connector_marius. I stripped out the Pulsar client (that comes in the other PR), the tests (please let me know which tests should make it into the PR) and a couple of things that were not needed to complete the integration. |
Change base image from `conda/miniconda3` (based off Debian Stretch) to `python:3.12-bookworm`. Miniconda is not required in the base image. Add the Galaxy Depot repository, which provides SLURM DRMAA packages for Debian Buster and newer releases. Do not install the package `apt-transport-https`, it is now a dummy package, see https://packages.debian.org/en/bookworm/apt-transport-https. Install the package `slurm` instead of `slurm-llnl`. Newer versions of the `munge` package include the binary `/usr/sbin/mungekey` instead of `/usr/sbin/create-munge-key`. Nevertheless, the key seems to be created automatically when installing the package, as running `mungekey` yields 'mungekey: Error: Failed to create "/etc/munge/munge.key": File exists'.
Build wheel automatically when building the Docker image. Exclude the source code from the output image through a multistage build.
…oexecutionLaunchMixin` to `BaseRemoteConfiguredJobClient`
1e1a3c7
to
7c8371a
Compare
If this one is fine, then #370 should be closed. |
@@ -0,0 +1,45 @@ | |||
import argparse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment at the top of this file would be appreciated as well.
docker/coexecutor/Dockerfile
Outdated
FROM conda/miniconda3 | ||
FROM python:3.12-bookworm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is fair at all to say "Miniconda" is not required in the base image as the commit message suggests. I get that it isn't the modality that you wish to run it in but it is documented in https://pulsar.readthedocs.io/en/latest/containers.html#co-execution as an option and it is an option that makes a lot of sense to me. Is the slurm stuff required for your use case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is fair at all to say "Miniconda" is not required in the base image as the commit message suggests.
I am assuming "Miniconda" is not required in the base image because I saw line 60 (the last line): RUN _pulsar-conda-init --conda_prefix=/pulsar_dependencies/conda
. It looks like it installs Miniconda, thus it should make it to the Docker image without needing to put it in the base image. But I am not familiar with Pulsar so I am not sure if line 60 actually has an effect equivalent to including it in the base image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the SLURM stuff, it is not needed for this use case but it was already in the old Dockerfile (it is needed for other use cases).
# Install packages
&& apt-get update \
&& apt-get install -y --no-install-recommends gcc \
libcurl4-openssl-dev \
cvmfs cvmfs-config-default \
slurm-llnl slurm-drmaa-dev \
bzip2 \
Because I am changing the base image to Debian Bookworm, the changes are required not to break the build with the new base image (that's why the whole commit has the title "Change base image in coexecutor Dockerfile"). slurm-drmaa1
is not available on the Debian repos anymore for Bookworm (it has to be installed from the Galaxy Depot). The latest version of slurm-drmaa1
available in the Galaxy depot (1.1.4) requires a library libslurm36
from Debian Bullseye. There exists a new version 1.1.5 of libslurm-drmaa1
on GitHub that maybe works on Bookworm without libraries from Bullseye, but @natefoo has not published a Debian package yet.
Introduce a JSON transfer action that indicates that the pulsar server should create a JSON manifest that can be used to stage files by an external system that can stage files in and out of the compute environment. This should be the general strategy for collecting input and output files for ARC, DIRAC, AWS batch etc.
Before launching jobs, Pulsar clients receive the staging manifest as a dictionary. When the job is complete, the script
pulsar-create-output-manifest
can be used to create a manifest declaring which files should be staged out by the external system.This PR also updates the coexecutor Dockerfile so that it works again and so that it builds the Pulsar wheel (no need to build it separately anymore).